---
title: "Audio"
description: "REST endpoints for audio streaming and text-to-speech"
---

## Get Recent Audio

Retrieve the last 10 seconds of audio data from a user session.

<Warning>
This endpoint is restricted to the `com.augmentos.shazam` package only.
</Warning>

### Endpoint

<CodeGroup>
```bash Production
GET https://api.mentra.glass/api/audio/:userId
```

```bash Development
GET https://devapi.mentra.glass/api/audio/:userId
```

```bash Local
GET http://localhost:8002/api/audio/:userId
```
</CodeGroup>

<Warning>
The source declares this route as `/api/audio/:userId` (line 47 of `audio.routes.ts`), although the router is already mounted at `/api`. The declaration should be just `/audio/:userId`.
</Warning>

### Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `userId` | string | Target user ID (in URL) |

### Query Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `apiKey` | string | Yes | App API key |
| `packageName` | string | Yes | Must be `com.augmentos.shazam` |
| `userId` | string | Yes | Target user ID (same as URL parameter) |
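
Because `userId` appears both in the path and in the query string, it is easy to set one and forget the other. The following is a minimal client-side sketch that builds the request URL from the parameters listed above. The helper name `buildRecentAudioUrl` is hypothetical and not part of any SDK.

```typescript
// Hypothetical helper (not part of the API): builds the request URL
// for the recent-audio endpoint from the parameters documented above.
function buildRecentAudioUrl(
  baseUrl: string,
  userId: string,
  apiKey: string,
  packageName: string = "com.augmentos.shazam",
): string {
  // userId goes into the path segment...
  const url = new URL(`/api/audio/${encodeURIComponent(userId)}`, baseUrl);
  // ...and is repeated in the query string alongside the credentials.
  url.searchParams.set("apiKey", apiKey);
  url.searchParams.set("packageName", packageName);
  url.searchParams.set("userId", userId);
  return url.toString();
}

// Example: target the local server for a placeholder user.
const recentAudioUrl = buildRecentAudioUrl(
  "http://localhost:8002",
  "user@example.com",
  "YOUR_API_KEY",
);
console.log(recentAudioUrl);
```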

### Response

Success (200):
- Binary audio data stream
- Content-Type: `application/octet-stream`
- Format: PCM audio buffer (concatenated audio chunks)

Error (401):
```json
{
  "success": false,
  "message": "Invalid API key."
}
```

The `message` may instead be `"Authentication required. Provide apiKey, packageName, and userId."`

Error (403):
```json
{
  "success": false,
  "message": "Unauthorized package name"
}
```

Error (404):
```json
{
  "error": "Session not found"
}
```

The `error` value may instead be `"No audio available"` or `"No decodable audio available"`.

Error (500):
```json
{
  "error": "Error fetching audio"
}
```
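
The error bodies above come in two shapes: 401 and 403 responses carry a `message` field, while 404 and 500 responses carry an `error` field. A client may want to handle both. The sketch below is one way to do that; `extractErrorMessage` and `fetchRecentAudio` are hypothetical names, and the fallback text for unparseable bodies is an assumption.

```typescript
// Hypothetical helper: pulls a readable message out of either error shape
// documented above ({ success, message } or { error }).
function extractErrorMessage(status: number, bodyText: string): string {
  try {
    const body: unknown = JSON.parse(bodyText);
    if (typeof body === "object" && body !== null) {
      const record = body as Record<string, unknown>;
      const message = record["message"];
      if (typeof message === "string") return message;
      const error = record["error"];
      if (typeof error === "string") return error;
    }
  } catch {
    // Body was not JSON; fall through to the generic message.
  }
  return `HTTP ${status}`; // Assumed fallback, not defined by the API.
}

// Hypothetical client call: returns the raw bytes on success and
// throws with the server-provided message otherwise.
async function fetchRecentAudio(url: string): Promise<Uint8Array> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(extractErrorMessage(response.status, await response.text()));
  }
  return new Uint8Array(await response.arrayBuffer());
}
```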

### Implementation

- **File**: `packages/cloud/src/routes/audio.routes.ts:47-91`
- **Middleware**: `shazamAuthMiddleware` - Validates package and API key
- **Service**: Uses `AudioManager.getRecentAudioBuffer()`

### Authorization

- Only the `com.augmentos.shazam` package is allowed
- Requires a valid API key for that package
- The target user ID must be specified in both the URL and the query parameters

### Audio Processing

- Returns buffered audio from `userSession.audioManager.getRecentAudioBuffer()`
- Audio chunks are concatenated into a single buffer
- LC3 codec support is commented out in the source but planned for the future
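
To illustrate what "concatenated into a single buffer" means, here is a minimal sketch of joining chunks in arrival order. It is not the `AudioManager` implementation; the function name `concatChunks` and the use of `Uint8Array` are assumptions made for the example.

```typescript
// Illustrative only: joins audio chunks end to end in the order given.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  // Compute the total size first so the output is allocated once.
  const totalLength = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const combined = new Uint8Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    combined.set(chunk, offset); // Copy this chunk at the current write position.
    offset += chunk.byteLength;
  }
  return combined;
}

// Example with two tiny placeholder chunks.
const combinedAudio = concatChunks([new Uint8Array([1, 2]), new Uint8Array([3])]);
console.log(combinedAudio.byteLength);
```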

## Text-to-Speech

Convert text to speech using the ElevenLabs API.

### Endpoint

<CodeGroup>
```bash Production
GET https://api.mentra.glass/api/tts
```

```bash Development
GET https://devapi.mentra.glass/api/tts
```

```bash Local
GET http://localhost:8002/api/tts
```
</CodeGroup>

<Warning>
The source declares this route as `/api/tts` (line 94 of `audio.routes.ts`), although the router is already mounted at `/api`. The declaration should be just `/tts`.
</Warning>

### Query Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `text` | string | Yes | Text to convert to speech |
| `voice_id` | string | No | ElevenLabs voice ID (uses default if not provided) |
| `model_id` | string | No | TTS model (defaults to `eleven_flash_v2_5`) |
| `voice_settings` | JSON string | No | Voice customization settings |

### Response

Success (200):
- Streaming MP3 audio data
- Content-Type: `audio/mpeg`
- Connection: `keep-alive`

Error (400):
```json
{
  "success": false,
  "message": "Text parameter is required and must be a string"
}
```

Other validation failures return the same shape with a different `message`.

Error (500):
```json
{
  "success": false,
  "message": "TTS service not configured"
}
```

The `message` may instead be `"Internal server error"`.

### Voice Settings Example

```json
{
  "stability": 0.5,
  "similarity_boost": 0.5,
  "style": 0.5,
  "use_speaker_boost": true
}
```
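
Since `voice_settings` is passed as a JSON string in the query, the object above has to be serialized and URL-encoded before it is sent. The sketch below shows one way a client might assemble the full TTS URL. The `buildTtsUrl` helper and the `TtsOptions` and `VoiceSettings` types are hypothetical names introduced for illustration.

```typescript
// Field names follow the voice settings example above.
interface VoiceSettings {
  stability?: number;
  similarity_boost?: number;
  style?: number;
  use_speaker_boost?: boolean;
}

// Hypothetical options bag for the optional query parameters.
interface TtsOptions {
  voiceId?: string;
  modelId?: string;
  voiceSettings?: VoiceSettings;
}

// Hypothetical helper: builds the /api/tts URL, omitting unset optionals
// so the server-side defaults apply.
function buildTtsUrl(baseUrl: string, text: string, options: TtsOptions = {}): string {
  const url = new URL("/api/tts", baseUrl);
  url.searchParams.set("text", text);
  if (options.voiceId) url.searchParams.set("voice_id", options.voiceId);
  if (options.modelId) url.searchParams.set("model_id", options.modelId);
  if (options.voiceSettings) {
    // voice_settings travels as a JSON string; searchParams handles the URL encoding.
    url.searchParams.set("voice_settings", JSON.stringify(options.voiceSettings));
  }
  return url.toString();
}

// Example using the voice ID from this page's example request.
const ttsUrl = buildTtsUrl("http://localhost:8002", "Hello world", {
  voiceId: "21m00Tcm4TlvDq8ikWAM",
  voiceSettings: { stability: 0.5, similarity_boost: 0.5 },
});
console.log(ttsUrl);
```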

### Implementation

- **File**: `packages/cloud/src/routes/audio.routes.ts:94-223`
- **Service**: Proxies to ElevenLabs API
- **Streaming**: Streams response directly to client using fetch API

### Configuration

Requires environment variables:
- `ELEVENLABS_API_KEY`: Your ElevenLabs API key
- `ELEVENLABS_DEFAULT_VOICE_ID`: Default voice to use (optional if `voice_id` is provided in the request)
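
The interaction between the two variables and the `voice_id` query parameter can be sketched as follows. This is an illustration under stated assumptions, not the server's code: `resolveTtsConfig` and `TtsEnv` are hypothetical names, and the error raised when no voice ID is available is invented for the example. Only the `"TTS service not configured"` message comes from this page.

```typescript
// Hypothetical view of the relevant environment variables.
interface TtsEnv {
  ELEVENLABS_API_KEY?: string;
  ELEVENLABS_DEFAULT_VOICE_ID?: string;
}

// Hypothetical helper: the request's voice_id wins; the env default is the fallback.
function resolveTtsConfig(
  env: TtsEnv,
  requestedVoiceId?: string,
): { apiKey: string; voiceId: string } {
  if (!env.ELEVENLABS_API_KEY) {
    throw new Error("TTS service not configured");
  }
  const voiceId = requestedVoiceId ?? env.ELEVENLABS_DEFAULT_VOICE_ID;
  if (!voiceId) {
    // Assumed behavior: the actual server response for this case is not documented here.
    throw new Error("No voice ID provided and no default voice configured");
  }
  return { apiKey: env.ELEVENLABS_API_KEY, voiceId };
}

// Example: no voice_id in the request, so the configured default is used.
const resolved = resolveTtsConfig({
  ELEVENLABS_API_KEY: "placeholder-key",
  ELEVENLABS_DEFAULT_VOICE_ID: "21m00Tcm4TlvDq8ikWAM",
});
console.log(resolved.voiceId);
```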

### Example Request

```bash
GET /api/tts?text=Hello%20world&voice_id=21m00Tcm4TlvDq8ikWAM
```

### ElevenLabs Integration

- API endpoint: `https://api.elevenlabs.io/v1/text-to-speech/{voice_id}/stream`
- Requires `xi-api-key` header for authentication
- Supports streaming response for low latency
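
For orientation, the upstream call described above might be assembled roughly as follows. This sketch assumes the request is a `POST` with a JSON body carrying `text`, `model_id`, and `voice_settings`, which matches the ElevenLabs streaming endpoint as publicly documented. The `buildElevenLabsRequest` helper and `UpstreamRequest` type are hypothetical and do not reflect how `audio.routes.ts` is actually written.

```typescript
// Hypothetical description of the upstream request; pass its fields to fetch().
interface UpstreamRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: assembles the ElevenLabs streaming request.
function buildElevenLabsRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  modelId: string = "eleven_flash_v2_5", // Default model named on this page.
  voiceSettings?: Record<string, unknown>,
): UpstreamRequest {
  const payload: Record<string, unknown> = { text, model_id: modelId };
  if (voiceSettings !== undefined) {
    payload["voice_settings"] = voiceSettings;
  }
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}/stream`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey, // Authentication header required by ElevenLabs.
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}

// Example: build (but do not send) a request with a placeholder key.
const upstream = buildElevenLabsRequest("placeholder-key", "21m00Tcm4TlvDq8ikWAM", "Hello world");
console.log(upstream.url);
```

Keeping request construction separate from the network call makes the proxy logic easy to unit test without contacting ElevenLabs.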

## Error Codes

| Code | Description |
|------|-------------|
| 400 | Invalid parameters or voice settings |
| 401 | Authentication required or invalid API key |
| 403 | Unauthorized package name (audio endpoint only) |
| 404 | Session not found or no audio available |
| 500 | Internal server error or TTS service not configured |

## Notes

- The audio endpoint is restricted to the Shazam app (`com.augmentos.shazam`) for music recognition
- The TTS endpoint is publicly accessible but requires ElevenLabs configuration on the server
- Audio is buffered by and retrieved from `AudioManager`
- TTS responses are streamed for low latency
- Both route definitions include an `/api` prefix even though the router is already mounted at `/api`; the prefix should be removed